Language model selection based on the analysis of Japanese spontaneous speech on travel arrangement task

Authors

  • Akira Kurematsu
  • Atsushi Sukenori
Abstract

This paper addresses language model selection based on an analysis of collected Japanese spontaneous speech data in the travel arrangement task, which contains five different subtasks. The procedure for transcribing and segmenting the Japanese spontaneous speech in Romanized transcription is described. Separate topic-dependent language models were evaluated by calculating perplexity and by applying them to Japanese speech recognition on the travel arrangement task corpus. Use of the subtopic language models reduced perplexity and improved speech recognition performance.
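
The perplexity comparison described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the bigram order, the add-one smoothing, the two subtask labels (`hotel`, `train`), and the tiny Romanized sentences, which are invented toy data. The sketch trains one model on the pooled data and one on a single subtask, then scores a held-out sentence from that subtask with both.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigram histories and bigrams, adding sentence boundary markers."""
    history, bigrams, vocab = Counter(), Counter(), set()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            history[prev] += 1
            bigrams[(prev, cur)] += 1
    return history, bigrams, vocab


def perplexity(model, sentences, vocab_size):
    """Perplexity under an add-one smoothed bigram model (assumed smoothing)."""
    history, bigrams, _ = model
    log_prob, n_events = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = (bigrams[(prev, cur)] + 1) / (history[prev] + vocab_size)
            log_prob += math.log(p)
            n_events += 1
    return math.exp(-log_prob / n_events)


# Invented toy corpora; the subtask names are hypothetical.
hotel = [
    "heya o yoyaku shitai desu".split(),
    "heya wa aite imasu ka".split(),
    "shinguru no heya o onegai shimasu".split(),
]
train = [
    "kippu o kaitai desu".split(),
    "densha wa nanji ni demasu ka".split(),
    "kippu wa ikura desu ka".split(),
]
held_out_hotel = ["heya o onegai shimasu".split()]

general_model = train_bigram(hotel + train)  # single pooled model
topic_model = train_bigram(hotel)            # topic-dependent model

# Use the pooled vocabulary size for both models so the scores are comparable.
vocab_size = len(general_model[2])

pp_general = perplexity(general_model, held_out_hotel, vocab_size)
pp_topic = perplexity(topic_model, held_out_hotel, vocab_size)

print(f"general model perplexity: {pp_general:.2f}")
print(f"topic model perplexity:   {pp_topic:.2f}")
```

On this toy data the topic-dependent model gives the lower perplexity for the in-topic sentence, which mirrors the direction of the result reported in the abstract. The magnitude carries no meaning at this scale.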

Similar resources

Data Collection and Transliteration of Japanese Spontaneous Database in the Travel Arrangement Task Domain

This paper describes the method used to construct and transcribe Japanese spontaneous speech data for VERBMOBIL, the German speech translation research project. A spontaneous spoken dialogue database is the basis for developing speech and language processing for dialogue systems such as speech translation systems. The extended data of human-to-human spoken dialogue in the scenario of travel arrange...

Toward translating Korean speech into other languages

This paper describes research activities of ETRI in multi-lingual spontaneous speech translation. We have developed Korean-to-English and Korean-to-Japanese speech translation system prototypes that include a 5,000-word spontaneous Korean speech recognizer, Korean-English and Korean-Japanese translators, and a Korean speech synthesizer with spontaneous prosody in the travel planning task. We utilize mu...

The relationship between task repetition and language proficiency

Task repetition is now considered an important task-based implementation variable which can affect complexity, accuracy, and fluency of L2 speech. However, in order to move towards theorizing the role of task repetition in second language acquisition, it is necessary that individual variables be taken into account. The present study aimed to investigate the way task r...

Selection of Multi-Word Expressions from Web N-gram Corpus for Speech Recognition

This paper proposes a method for constructing a statistical language model with multi word expressions (MWEs) selected from Google Japanese Web N-gram. MWEs are concatenated words that consist of idiomatic expressions or long-length morpheme sequences used frequently. In this paper a method for selecting the effective MWEs that improve the language model based on co-occurrence probabilities of ...

The Relationship between Iranian EFL Learners’ Ambiguity Tolerance and the Accuracy of Their Task-based Oral Speech

Various individual differences, including ambiguity tolerance (AT), have gained momentum because of the influence they can exert on the process and product of learning, and thereby, on various aspects of the learner’s interlanguage system such as accuracy of oral speech. The present study was undertaken to examine the extent to which Iranian EFL learners’ AT was significantly correlated with th...

Publication date: 1999